[Day 24] A Beginner's Guide to Agents: Building an Intelligent Agent from Scratch

2024 iThome Ironman Contest

DAY 24

Generative AI

T 大使 AI 之旅 (T Ambassador's AI Journey) series, part 24

16th Ironman Contest

我的狗狗叫饅頭

2024-08-28 23:54:19

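Recap

Over the past two days we looked at how LLMs are fine-tuned today, and after trying a few frameworks hands-on, fine-tuning has started to feel familiar. Today we turn to what is likely the next trend in generative AI: Agents. Let's take a proper look.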

Tools

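Before using an Agent, I think it helps to understand Tools first. Tools are the functions an Agent calls to complete specific tasks. They can come from LangChain's built-in Toolkits, from third-party tools in the community packages, such as Google Search (paid), or you can define your own. When designing a Tool, two conditions need to be met:

Provide the right tool: the tool should target a concrete task, with a specific purpose and function.
Describe how to use it: state clearly what the tool does and when to use it, so the Agent can pick it at the right moment and use it effectively.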

Hands-on 🔥

from langchain_core.tools import tool

@tool
def circle_area(radius):
	"""
	calculates the area of a circle
	
	Args:
		radius: the radius of the circle
	"""
	return 3.14 * float(radius) ** 2

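Discussion of the code 🧐:

As the code shows, the imported tool is applied to the function as a decorator. The docstring should describe what the function does and what its parameters are, since that description is what the Agent relies on when choosing a Tool. Here the task is to compute the area of a circle, the input parameter is the radius, and the function returns the computed result.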
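To make the role of the docstring more concrete, here is a minimal standard-library sketch of what a tool decorator conceptually records. `simple_tool` and `SimpleTool` are hypothetical names invented for illustration. This is not how LangChain's `@tool` is implemented; it only approximates the idea that the function name, docstring, and parameter names become the metadata an Agent gets to see.

```python
import inspect
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool object (not LangChain's class)."""
    name: str
    description: str
    arg_names: list
    func: Callable

    def run(self, **kwargs):
        # Call the wrapped function with keyword arguments.
        return self.func(**kwargs)


def simple_tool(func):
    """Hypothetical decorator: wrap a function together with its metadata."""
    return SimpleTool(
        name=func.__name__,
        # inspect.getdoc removes the docstring's indentation.
        description=inspect.getdoc(func) or "",
        arg_names=list(inspect.signature(func).parameters),
        func=func,
    )


@simple_tool
def circle_area(radius):
    """
    calculates the area of a circle

    Args:
        radius: the radius of the circle
    """
    return 3.14 * float(radius) ** 2


print(circle_area.name)        # the function name becomes the tool name
print(circle_area.arg_names)   # parameter names the model has to fill in
print(circle_area.description) # the text the model would read
print(circle_area.run(radius="2"))
```

A vague or missing docstring would leave `description` nearly empty in this sketch, which mirrors why the usage description matters so much for a real Agent.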

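What is an Agent

We have seen how to define the Tools to hand to an Agent. Before the implementation, a short introduction: an Agent gives an LLM the ability to plan and solve complex problems on its own, so that it behaves more like a person would.
Taking the circle-area example above: if I give the model a radius and ask for the area of the circle, it uses the formula I provided, because that Tool is what it has for carrying out the task.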
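The idea that the model decides and the Tool computes can be sketched without any LLM. In the snippet below, `fake_model_decision` is a hypothetical stand-in for the model's output (a tool name plus arguments); a real Agent would obtain this from the LLM.

```python
def circle_area(radius):
    """calculates the area of a circle"""
    return 3.14 * float(radius) ** 2


# Registry of the tools the "agent" is allowed to use.
TOOLS = {"circle_area": circle_area}


def fake_model_decision(question):
    # Hypothetical: a real LLM would read the tool descriptions and the
    # question, then emit a structured request like this one.
    return {"tool": "circle_area", "args": {"radius": 2}}


def answer(question):
    decision = fake_model_decision(question)
    tool = TOOLS[decision["tool"]]    # look the tool up by name
    return tool(**decision["args"])   # the tool, not the model, does the math


result = answer("What is the area of a circle with radius 2?")
print(result)
```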

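Hands-on 🔥

I tested language models from Google, OpenAI, and 台智雲 (Taiwan AI Cloud), and so far only the OpenAI models work without problems. I can't tell whether the issue lies in LangChain's create_react_agent or in my own code. It may also be that my wrapper around the 台智雲 model needs restructuring; I will look into that later. LangChain keeps changing between releases, and my Agent was still working fine a month ago 🙄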

Tools

from langchain_core.tools import tool

@tool
def tool_note(note):
	"""
	saves a note to a local file
	Args:
		note: the text note to save
	"""
	# Append so earlier notes are kept; utf-8 handles non-ASCII text
	with open("notes.txt", "a", encoding="utf-8") as f:
		f.write(note + "\n")

@tool
def circle_area(radius):
	"""
	calculates the area of a circle
	Args:
		radius: the radius of the circle
	"""
	return 3.14 * float(radius) ** 2

@tool
def triangle_area(a_side, b_side, c_side):
	"""
	calculates the area of a triangle
	Args:
		a_side: the length of side a
		b_side: the length of side b
		c_side: the length of side c
	"""
	a = float(a_side)
	b = float(b_side)
	c = float(c_side)
	s = (a + b + c) / 2
	area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
	return area

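Discussion of the code 🧐:

Three tools are defined here:
- one computes the area of a circle
- one computes the area of a triangle
- one writes the content returned by the model to a txt file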
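As a quick sanity check of Heron's formula used in triangle_area: for sides 3, 4, 5 the semi-perimeter is s = 6, so the area is the square root of 6 × 3 × 2 × 1 = 36, which is 6. One caveat: if the three lengths cannot form a triangle, the product under the square root becomes negative, and `** 0.5` on a negative float returns a complex number in Python instead of raising an error. The sketch below adds a guard for that case; `safe_triangle_area` is a hypothetical helper name and not part of the original tools.

```python
def safe_triangle_area(a_side, b_side, c_side):
    """Heron's formula with a triangle-inequality check (illustrative)."""
    a, b, c = float(a_side), float(b_side), float(c_side)
    # Each side must be positive and shorter than the sum of the other two.
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        raise ValueError("the given lengths cannot form a triangle")
    s = (a + b + c) / 2
    return (s * (s - a) * (s - b) * (s - c)) ** 0.5


area = safe_triangle_area(3, 4, 5)
print(area)

try:
    safe_triangle_area(1, 2, 10)
except ValueError as err:
    print("rejected:", err)
```

Raising a clear error gives the Agent a readable observation to react to, rather than a complex number it might pass along as an answer.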

Agent

# Import packages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain_openai import ChatOpenAI
# Assumes the three tools above are saved as tools_in_ithome.py
from tools_in_ithome import circle_area, triangle_area, tool_note
from dotenv import load_dotenv

# Load environment variables (such as OPENAI_API_KEY) from a .env file
load_dotenv()

# Choose the model and the Tools
model = ChatOpenAI(model="gpt-4o")
tools = [circle_area, triangle_area, tool_note]

# Set up the prompt
prompt = ChatPromptTemplate.from_messages(
	[
		("system", "You are a helpful assistant"),
		("human", "{input}"),
		MessagesPlaceholder(variable_name="agent_scratchpad")
	]
)

# Create the agent
agent = create_openai_functions_agent(
	llm=model,
	prompt=prompt,
	tools=tools,
)

# Wrap the agent in an executor that runs the tool-calling loop
agent_executor = AgentExecutor(
	agent=agent,
	tools=tools,
	verbose=True
)

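Discussion of the code 🧐:

The prompt and model selection work the same as before, except that this time the model is OpenAI's. Note that the prompt must end with a MessagesPlaceholder for the agent_scratchpad variable.
Next, create_openai_functions_agent creates the Agent. This step corresponds to the chaining we did before, except that the Tools also have to be specified.
Finally, the chained Agent is passed to AgentExecutor. With verbose set to True, you can follow the model's reasoning process and see whether any Tools were used.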

response = agent_executor.invoke({"input": "Can you calculate the area of a circle with radius 2 for me?"})

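Discussion of the code 🧐:

When the verbose output contains a tool invocation (the part marked with a red box in the original screenshot), the Agent used a Tool.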

response = agent_executor.invoke({"input": "Can you calculate the area of a triangle with sides 3, 4, and 5, and write it to my notes?"})

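Discussion of the code 🧐:

The Agent can also call Tools one after another, so a single invoke can use several Tools. The result was also written to the txt file successfully.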
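How one invoke can chain several tool calls can be sketched as a loop: ask the model for the next action, run the tool, append the result to a list of intermediate steps, and stop once the model returns a final answer. `scripted_model` is a hypothetical stand-in for the LLM, and the loop is a simplification under that assumption, not AgentExecutor's actual implementation. The note is written to a temporary directory so the sketch does not touch a real notes.txt.

```python
import os
import tempfile


def triangle_area(a_side, b_side, c_side):
    a, b, c = float(a_side), float(b_side), float(c_side)
    s = (a + b + c) / 2
    return (s * (s - a) * (s - b) * (s - c)) ** 0.5


NOTES_PATH = os.path.join(tempfile.mkdtemp(), "notes.txt")


def tool_note(note):
    with open(NOTES_PATH, "a", encoding="utf-8") as f:
        f.write(note + "\n")
    return "note saved"


TOOLS = {"triangle_area": triangle_area, "tool_note": tool_note}


def scripted_model(question, intermediate_steps):
    # Hypothetical policy: pick the next action from what has been done so far.
    if len(intermediate_steps) == 0:
        return {"tool": "triangle_area",
                "args": {"a_side": 3, "b_side": 4, "c_side": 5}}
    if len(intermediate_steps) == 1:
        area = intermediate_steps[0][1]  # observation from the first tool call
        return {"tool": "tool_note", "args": {"note": f"Triangle area: {area}"}}
    return {"final": f"The area is {intermediate_steps[0][1]} and it was saved."}


def run_agent(question, max_steps=5):
    intermediate_steps = []  # list of (action, observation) pairs
    for _ in range(max_steps):
        action = scripted_model(question, intermediate_steps)
        if "final" in action:
            return action["final"], intermediate_steps
        observation = TOOLS[action["tool"]](**action["args"])
        intermediate_steps.append((action, observation))
    raise RuntimeError("agent did not finish within max_steps")


output, steps = run_agent("Area of a 3-4-5 triangle, then save it to my notes.")
print(output)
print([action["tool"] for action, _ in steps])
```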

Replacing create_openai_functions_agent with create_tool_calling_agent gives the same result.

Breaking down the structure

Now that we have run it, let's see how LangChain wraps these chains into an object and how the Agent's LCEL structure is chained together.

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
print(agent_executor)

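Discussion of the code 🧐:

The printout shows that this object is an LCEL structure underneath. The Runnable in use is RunnableMultiActionAgent, and the runnable=RunnableAssign it wraps appears to be the agent built earlier, here in its agent = create_tool_calling_agent(model, tools, prompt) form.
In the first step, the agent_scratchpad read by MessagesPlaceholder(variable_name="agent_scratchpad") is filled from intermediate_steps, the record of intermediate steps taken during the conversation.
The second step is the ChatPromptTemplate we defined.
The third step, RunnableBinding, simply binds the LLM to the Tools we defined.
The last step is ToolsAgentOutputParser, which handles tool calls in the model output and makes sure the results of those calls are returned correctly and incorporated into the final output.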
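The four stages described above can be mimicked with a tiny pipe-style composition. Everything in this sketch (`Step`, `fake_llm_with_tools`, the dictionary shapes) is a hypothetical simplification meant only to show the data flow; the real classes are RunnableAssign, ChatPromptTemplate, RunnableBinding, and ToolsAgentOutputParser, and their internals differ.

```python
class Step:
    """Minimal pipe-able wrapper, loosely imitating LCEL's `|` composition."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # (a | b) runs a first, then feeds its result to b.
        return Step(lambda x: other.func(self.func(x)))

    def invoke(self, x):
        return self.func(x)


def assign_scratchpad(inputs):
    # Stage 1: turn intermediate_steps into the agent_scratchpad value.
    notes = [f"{tool} -> {result}" for tool, result in inputs["intermediate_steps"]]
    return {**inputs, "agent_scratchpad": notes}


def build_prompt(inputs):
    # Stage 2: fill the prompt template (system, human, scratchpad).
    return ["system: You are a helpful assistant",
            f"human: {inputs['input']}",
            *inputs["agent_scratchpad"]]


def fake_llm_with_tools(messages):
    # Stage 3: hypothetical model bound to tools; it requests a tool call
    # until a tool result shows up in the messages.
    if any("circle_area ->" in m for m in messages):
        return {"content": "The area is 12.56", "tool_calls": []}
    return {"content": "",
            "tool_calls": [{"name": "circle_area", "args": {"radius": 2}}]}


def parse_output(message):
    # Stage 4: decide between "call these tools" and "finished".
    if message["tool_calls"]:
        return ("actions", message["tool_calls"])
    return ("finish", message["content"])


agent = (Step(assign_scratchpad) | Step(build_prompt)
         | Step(fake_llm_with_tools) | Step(parse_output))

question = "Area of a circle with radius 2?"
first = agent.invoke({"input": question, "intermediate_steps": []})
second = agent.invoke({"input": question,
                       "intermediate_steps": [("circle_area", 12.56)]})
print(first)   # the parser reports a tool call to perform
print(second)  # with a tool result in the scratchpad, the parser reports a finish
```

In this picture the executor is the part that sits outside the pipeline: it runs the requested tools, appends the results to intermediate_steps, and invokes the pipeline again.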

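Conclusion

So far nothing other than OpenAI's language models has worked for me. If anyone knows how to get this running with models other than OpenAI, please leave a comment and teach me. Today we defined a few simple functions and let the AI decide which Tool to use and when, and we also used several Tools within one invoke. We also looked at how the chain inside an Agent is linked together. I originally wanted to build the LCEL chain myself and pass in every variable by hand, but after seeing this I would rather stick with what LangChain already packages 😆. Tomorrow's implementation adds RAG and Memory, plus what I consider the strongest Tool around right now: Tavily AI.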